effective transfer learning
Leveraging TensorLeap for Effective Transfer Learning: Overcoming Domain Gaps - MarkTechPost
Constructing a large-scale dataset is often a prerequisite for tackling a machine learning task. Sometimes the task is niche, and building a large-scale dataset to train an entire model from scratch would be too expensive or even impossible. Do we need to train a model from scratch in every case? Imagine we would like to detect a certain animal, say an otter, in images. We would first need to collect many otter images and construct a training dataset.
Effective Transfer Learning for Low-Resource Natural Language Understanding
Natural language understanding (NLU) is the task of semantic decoding of human languages by machines. NLU models rely heavily on large training data to ensure good performance. However, many languages and domains have very few data resources and domain experts. It is necessary to overcome the data scarcity challenge that arises when very few or even zero training samples are available. In this thesis, we focus on developing cross-lingual and cross-domain methods to tackle low-resource issues. First, we propose to improve the model's cross-lingual ability by focusing on the task-related keywords, enhancing the model's robustness, and regularizing the representations. We find that the representations for low-resource languages can be easily and greatly improved by focusing on just the keywords. Second, we present Order-Reduced Modeling methods for cross-lingual adaptation, and find that modeling partial word orders instead of the whole sequence can improve both the robustness of the model against word order differences between languages and task knowledge transfer to low-resource languages. Third, we propose to leverage different levels of domain-related corpora and additional masking of data in pre-training for cross-domain adaptation, and discover that more challenging pre-training can better address the domain discrepancy issue in task knowledge transfer. Finally, we introduce a coarse-to-fine framework, Coach, and a cross-lingual and cross-domain parsing framework, X2Parser. Coach decomposes the representation learning process into coarse-grained and fine-grained feature learning, and X2Parser simplifies hierarchical task structures into flattened ones. We observe that simplifying task structures makes representation learning more effective for low-resource languages and domains.
RadImageNet: Training AI Models With Radiologic vs. Photographic Images
Yang Yang, PhD, Zahi Fayad, PhD, Xueyan Mei, PhD, Timothy Deyer, PhD, and colleagues from Icahn School of Medicine at Mount Sinai, University of Oklahoma, and Weill Cornell Medicine conducted a study to compare the performance of AI models pretrained on radiologic images with that of models pretrained on photographic images. They created a large-scale, diverse medical imaging dataset and used it to train CNNs on radiologic images only. The study is significant because the researchers demonstrated that pretraining with radiologic images rather than photographic images may result in more effective transfer learning for radiology AI models. A paper detailing the study, entitled "RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning", was published in RSNA Radiology AI on July 27, 2022. Within 10 days of publication, the paper had been downloaded over 1,000 times.
RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered that could affect the content. The study's purpose was to demonstrate the value of pretraining with millions of radiologic images, compared with ImageNet photographic images, on downstream medical applications when using transfer learning. This retrospective study included patients who had a radiologic study between 2005 and 2020 at an outpatient imaging facility.
More Effective Transfer Learning for NLP
This spring I presented a talk entitled "Effective Transfer Learning for NLP" at ODSC East. The talk was intended to demonstrate how surprisingly effective pre-trained word and document embeddings are at low training data volumes, and to lay out a set of practical recommendations for applying these techniques to your own tasks. Thanks to some excellent research by Alec Radford and the team at OpenAI, our recommendations are beginning to change. To explain why the tides are shifting, let's first walk through the rubric we use at Indico to evaluate whether or not a novel machine learning method is viable for industry use. Let's see how well pre-trained word and document embeddings satisfy these requirements. In short, using pre-trained embeddings is computationally cheap and performs well at the lower extremes of training data availability, but using static representations imposes an unfortunate cap on the benefit gained from additional training data.
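To make the idea of static representations concrete, here is a minimal sketch of the general pattern: frozen word vectors are averaged into a document embedding, and only a small logistic-regression head is trained on top. This is not the code from the talk or Indico's implementation. The tiny `EMBEDDINGS` table, the helper names (`embed`, `train_head`, `predict`), and the toy sentiment data are all hypothetical stand-ins chosen for illustration; real vectors would be loaded from a published pre-trained model.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained embedding table. In practice these
# vectors would be loaded from a published model, not written by hand.
EMBEDDINGS = {
    "great": np.array([1.0, 0.2]),
    "good": np.array([0.8, 0.1]),
    "awful": np.array([-1.0, 0.3]),
    "bad": np.array([-0.7, 0.2]),
    "movie": np.array([0.0, 1.0]),
    "plot": np.array([0.1, 0.9]),
}
DIM = 2


def embed(text):
    """Average the frozen word vectors of known tokens (zeros if none are known)."""
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    return np.mean(vectors, axis=0) if vectors else np.zeros(DIM)


def train_head(features, labels, lr=0.5, steps=500):
    """Fit a logistic-regression head by gradient descent.

    Only w and b are updated; the embeddings themselves never change,
    which is what makes the representation "static".
    """
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        error = probs - labels
        w -= lr * features.T @ error / len(labels)
        b -= lr * error.mean()
    return w, b


def predict(text, w, b):
    """Return 1 for a positive prediction, 0 for a negative one."""
    return int(embed(text) @ w + b > 0)


# Toy labeled data (hypothetical): 1 = positive, 0 = negative.
train_texts = ["great movie", "good plot", "awful movie", "bad plot"]
train_labels = np.array([1.0, 1.0, 0.0, 0.0])

X = np.stack([embed(t) for t in train_texts])
w, b = train_head(X, train_labels)

print(predict("good movie", w, b))
print(predict("awful plot", w, b))
```

Because the embedding table is never updated, training touches only three numbers here, which is why this approach is cheap and workable with a handful of examples. The same property is the limitation the passage describes: however much labeled data is added, the model can only reweight fixed features, not adapt them.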